Dynamic category compression in a data storage library

ABSTRACT

Methods and apparatus are provided for dynamically compressing categories in a data storage library. In one embodiment, the method includes retrieving an identification of a first category in the data storage library, the first category being a last-compressed category. Next, an identification number of a first order of the first category is retrieved, the first order being a last compressed order. Compression is resumed of orders in the first category with an order next following the first order and continued with additional orders in the first category. If a predetermined amount of time has elapsed, the identification of the first category and the identification number of the order of the first category being compressed are stored. If, however, the predetermined amount of time has not elapsed and compression of the first category is complete, compression of a second category is begun.

TECHNICAL FIELD

The present invention relates generally to data storage libraries and, in particular, to dynamically compressing category orders of logical volumes.

BACKGROUND ART

Many data processing systems require large amounts of data storage space and are configured in a hierarchical manner. More frequently accessed data is stored in high speed but expensive memory, such as in direct access storage devices (DASD), while less frequently accessed data is stored in slower speed but less expensive memory, such as on tape media in automated storage libraries. One such system is a virtual tape system (VTS) in which logical volumes, sometimes numbering in the hundreds of thousands, are written to tape cartridges. As illustrated in FIG. 1, a logical volume may be stored entirely in a single cartridge (such as logical volume X being stored in cartridge A) or may span two or more cartridges (such as logical volume Y being stored in cartridges A and B).

When a volume is added to the library, it is assigned an “order” in a “scratch” category. New orders are appended sequentially to the end of the scratch category. FIGS. 2A and 2B illustrate a new volume F being added to the end of the scratch category as order 73. When data is to be written to a volume, the volume is moved to a “private” category and assigned a new order appended to the end of the private category. FIGS. 3A and 3B illustrate two volumes F and G being added to the end of the private category as orders 104 and 105, respectively.

For any of a variety of reasons, data in a volume may no longer be needed and the volume is moved back from the private category to a scratch category, again being assigned a new order appended to the end of the scratch category.

When a volume is moved from one category to another (from scratch to private or from private to scratch), a vacancy is left in the sequence of orders. In FIGS. 3A and 3B, vacancies are left in the scratch category between orders 72 and 74 and between orders 241 and 243 when volumes F and G are moved to the private category. The vacated orders in a category are never reassigned and consequently the orders in the categories may quickly fragment. When the number of volumes in a VTS is in the hundreds of thousands, the number of orders may reach the millions on a busy system. The large number of vacant orders may significantly compromise performance of the database.
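The append-only assignment of orders and the resulting vacancies can be illustrated with a minimal Python sketch. The `Category` class, its method names, the `move` helper and the volume names are hypothetical and are not taken from any library manager implementation. The sketch assumes only what is described above: each category hands out sequentially increasing orders and never reassigns a vacated one.

```python
class Category:
    """Hypothetical model of a category whose orders are only ever appended."""

    def __init__(self, name):
        self.name = name
        self.next_order = 1      # next order number to hand out
        self.orders = {}         # order number -> volume name

    def append(self, volume):
        """Assign the volume the next sequential order at the end of the category."""
        order = self.next_order
        self.orders[order] = volume
        self.next_order += 1
        return order

    def remove(self, volume):
        """Remove a volume; its order number is left vacant and is not reused."""
        for order, name in list(self.orders.items()):
            if name == volume:
                del self.orders[order]
                return order
        raise KeyError(volume)

    def vacancies(self):
        """Count order numbers handed out so far that no longer hold a volume."""
        return (self.next_order - 1) - len(self.orders)


def move(volume, source, target):
    """Move a volume between categories, appending it to the end of the target."""
    source.remove(volume)
    return target.append(volume)


scratch = Category("scratch")
private = Category("private")
for v in ["A", "B", "C", "D"]:
    scratch.append(v)

new_order = move("B", scratch, private)   # B leaves a vacancy at scratch order 2
back = move("B", private, scratch)        # B returns at the end of scratch
```

Each round trip between the two categories leaves one vacancy behind in each of them, which is how the order sequences fragment over time on a busy system.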

SUMMARY OF THE INVENTION

The present invention provides methods, apparatus, computer program products and methods for deploying computing infrastructure for dynamically compressing categories in a data storage library. In one embodiment, the method includes retrieving an identification of a first category in the data storage library, the first category being a last-compressed category. Next, an identification number of a first order of the first category is retrieved, the first order being a last compressed order. Compression is resumed of orders in the first category with an order next following the first order and continued with additional orders in the first category. If a predetermined amount of time has elapsed, the identification of the first category and the identification number of the order of the first category being compressed are stored. If, however, the predetermined amount of time has not elapsed and compression of the first category is complete, compression of a second category is begun. Preferably, if a category is in use, is reserved or has an insufficient number of order vacancies, that category is not compressed and compression of the next category is begun instead.

In another embodiment, the apparatus includes a manager in a data storage library. The manager includes a database of logical volume categories, a processor and a memory storing program instructions executable in the processor. Each category is capable of containing a plurality of sequentially appended orders. The executable instructions are operable for retrieving from the database an identification of a first category, the first category being a last-compressed category; retrieving from the database an identification number of a first order of the first category, the first order of the first category being a last compressed order; resuming compression of orders in the first category with an order next following the first order; and continuing compression of orders in the first category. If a predetermined amount of time has elapsed, the identification of the first category and the identification number of the order of the first category being compressed are stored in the database. If, however, the predetermined amount of time has not elapsed and compression of the first category is complete, compression of a second category is begun.

Other features and advantages of the present invention should be apparent from the following description of the preferred embodiments, which illustrates, by way of example, the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an exemplary arrangement of logical volumes written to two data cartridges;

FIGS. 2A and 2B illustrate the appending of a new volume to a scratch category and the assignment of a new order to the volume;

FIGS. 3A and 3B illustrate the movement of two volumes from a scratch category to a private category and the assignment of new orders to the volumes;

FIG. 4 is a block diagram of an exemplary data storage library in which the present invention may be incorporated;

FIGS. 5A and 5B are a flowchart of a method of the present invention; and

FIGS. 6A, 6B and 6C illustrate the movement of orders within a category during a compression operation of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 4 is a block diagram of an exemplary data storage library 400 in which the present invention may be incorporated. The library 400 includes a library manager or controller 410, one or more data storage drives 420 operatively coupled to the manager 410, shelves or the equivalent in which storage cartridges 430 are retained and an electromechanical accessor 440 also operatively coupled to the manager 410 and which, under the direction of the manager 410, transports selected cartridges 430 between the storage shelves and the drives 420. The library manager 410 includes a memory 412 for storing programming instructions, a processor 414 for executing the instructions and a database 416 accessible to the processor 414. As used herein, the term “operatively coupled” may refer to an indirect or functional relationship of two components, devices or subsystems as well as to a direct electrical connection between the two.

Referring now to the flowchart of FIGS. 5A and 5B, the operation of the compression function of the present invention under the direction of the library manager 410 will be described. Preferably, the compression function will operate in the background and be scheduled to commence periodically at predetermined times. For example, the compression function may be scheduled to commence at 15 minutes past each hour and again at 45 minutes past each hour in order to avoid other functions which are scheduled to run on the hour or on the half hour. Consequently, a check is made to determine if the predetermined time has been reached (step 500). If so, the processor 414 retrieves from the memory 412 or database 416 the identification of the last category compressed (step 502) and the state of that compression, and determines whether compression of that category was complete (step 504). If compression of the last category was not complete, the processor 414 retrieves the last compressed order in the last compressed category (step 506). The order is then incremented to the next order (step 508) and compression begins (step 510). While compression proceeds, a check is periodically made to determine if the time window for performing the compression operation has expired (step 512), even if the category is not completely compressed. If the time has expired, the identification of the last compressed order is stored (step 514), as is the identification of the category being compressed (step 516), and the compression routine is exited (step 518), allowing the library manager 410 to perform other tasks.
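The scheduling check and the resumable, time-limited pass described above might be sketched as follows. This is an illustrative Python sketch only: the function names, the dictionary-based checkpoint and the `last_order_of` and `compress_one` callables are assumptions introduced for the example, and the time check is simplified to run before every order rather than periodically.

```python
import time


def should_start(minute_of_hour, start_minutes=(15, 45)):
    """Step 500: start only at the example times, off the hour and the half hour."""
    return minute_of_hour in start_minutes


def run_window(checkpoint, last_order_of, compress_one, budget_seconds,
               clock=time.monotonic):
    """Resume one category from a stored checkpoint and work until time expires.

    checkpoint     dict with 'category' and 'last_order' (steps 502 and 506)
    last_order_of  hypothetical callable: category id -> highest order number
    compress_one   hypothetical callable: (category id, order) -> None
    Returns the checkpoint to store and whether the category was finished.
    """
    deadline = clock() + budget_seconds
    category = checkpoint["category"]
    order = checkpoint["last_order"]
    end = last_order_of(category)
    while order < end:
        if clock() >= deadline:
            # Step 512 expired: the caller stores order and category
            # (steps 514 and 516) and exits (step 518).
            return {"category": category, "last_order": order}, False
        order += 1                        # step 508: advance to the next order
        compress_one(category, order)     # step 510: compress that order
    return {"category": category, "last_order": order}, True
```

Passing the clock as a parameter is a design choice made for the sketch so that the time window can be simulated; a real library manager would presumably persist the returned checkpoint in its memory or database between runs.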

If the time has not expired (step 512), another determination is made as to whether the category is completely compressed (step 520). If so (or if the determination made in step 504 was affirmative, that is, compression of the last category was already complete), the procedure increments to the next category (step 522). Checks are made to determine whether the new category is in use (step 524), is reserved (step 526) or has an insufficient number of vacancies (step 528). Reserved categories may include volumes on diagnostic cartridges, clean cartridges or any other user-defined category. When the number of vacancies in a category is not sufficiently large, such as about 20% to about 40%, and preferably about 30%, of the total number of orders in the category, it may not be worth spending computing resources to compress the category. If any of these conditions is met, the procedure again increments to the next category (step 522) and the checks are repeated. If the category is not in use, not reserved and sufficiently fragmented, another time check is made (step 530). If time has expired, the last compressed order and the category ID are stored (steps 514 and 516) and the routine exits (step 518). Otherwise, the first order of the category is compressed (step 532), the order is incremented (step 508) and the process continues as described.
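The three checks applied to each candidate category (steps 524, 526 and 528) can be sketched as a small Python predicate. The function names and dictionary keys are hypothetical, the 30% default reflects the preferred value mentioned above, and wrapping around to the first category after the last one is an assumption the description does not spell out.

```python
def is_eligible(in_use, reserved, vacancies, total_orders, threshold=0.30):
    """Steps 524-528: skip categories that are busy, reserved or barely fragmented."""
    if in_use or reserved:
        return False
    if total_orders == 0:
        return False              # nothing to compress in an empty category
    return vacancies / total_orders >= threshold


def next_eligible(categories, start_index):
    """Step 522: advance through the categories until one passes all three checks.

    categories is a list of dicts with the hypothetical keys 'in_use',
    'reserved', 'vacancies' and 'total_orders'. The search wraps around
    (an assumption) and returns an index, or None if no category qualifies.
    """
    count = len(categories)
    for offset in range(1, count + 1):
        index = (start_index + offset) % count
        c = categories[index]
        if is_eligible(c["in_use"], c["reserved"],
                       c["vacancies"], c["total_orders"]):
            return index
    return None
```

Returning `None` when nothing qualifies lets the caller end the run early instead of looping over the same ineligible categories until the time window expires.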

FIGS. 6A, 6B and 6C illustrate the manner in which a category is compressed. As indicated in FIG. 6A, there are vacancies in the category where orders 2, 25, 26, 28, 134 and 136 have been removed. During compression, the first vacancy is filled with the next non-vacant order. Thus, in FIG. 6B, the vacancy between orders 1 and 3 is filled with order 3, now reassigned as order 2. Order 4 is moved to the new vacancy created when order 3 was moved, and so forth. If the time period for compression expires before compression of the category is complete, compression is halted and pertinent information is stored. In FIG. 6B, compression was halted before order 132 could be moved. When compression is resumed, it begins with the order after the last order compressed. Thus, order 132 will be moved to the vacancy of order 128 in FIG. 6C and the remaining orders will be likewise compressed.
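The renumbering of FIGS. 6A through 6C can be traced with a short Python sketch. The `compress` function, the dictionary representation of a category and the `max_orders` limit (a stand-in for the time window) are assumptions made for illustration. The sketch also assumes the category extends to order 140, since the figures are not reproduced here, and it records the checkpoint as the original number of the last order processed.

```python
def compress(orders, last_compressed=0, max_orders=None):
    """Renumber occupied orders consecutively, resuming after a checkpoint.

    orders           dict: order number -> volume, modified in place
    last_compressed  original number of the last order already processed
    max_orders       stand-in for the time window; None means no limit
    Returns the original number of the last order processed in this call.
    """
    # Orders at or below the checkpoint are already consecutive, so the
    # next free slot follows them directly.
    target = sum(1 for o in orders if o <= last_compressed) + 1
    processed = 0
    for order in sorted(o for o in orders if o > last_compressed):
        if max_orders is not None and processed >= max_orders:
            break                                  # window expired, halt here
        if order != target:
            orders[target] = orders.pop(order)     # fill the earliest vacancy
        target += 1
        last_compressed = order
        processed += 1
    return last_compressed


# Vacancies as in FIG. 6A; the upper bound of 140 is assumed.
vacant = {2, 25, 26, 28, 134, 136}
orders = {n: f"V{n}" for n in range(1, 141) if n not in vacant}

halted_at = compress(orders, max_orders=127)        # halts before order 132
resumed_to = compress(orders, last_compressed=halted_at)
```

In this trace the first call stops after original order 131, which has become order 127, and the resumed call moves original order 132 into order 128, matching the description of FIG. 6C.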

It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions and a variety of forms and that the present invention applies regardless of the particular type of signal bearing media actually used to carry out the distribution. Examples of computer readable media include recordable-type media such as a floppy disk, a hard disk drive, a RAM, and CD-ROMs and transmission-type media such as digital and analog communication links.

The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated. 

CLAIMS

1. A method for dynamically compressing categories in a data storage library, comprising: retrieving an identification of a first category in a data storage library, the first category being a last-compressed category; retrieving an identification number of a first order of the first category, the first order of the first category being a last compressed order; resuming compression of orders in the first category with a next order of the first category; continuing compression of orders in the first category; if a predetermined amount of time has elapsed, storing the identification of the first category and the identification number of the order of the first category being compressed; and if the predetermined amount of time has not elapsed and compression of the first category is complete, beginning compression of a second category.
 2. The method of claim 1, further comprising beginning compression of a third category if the second category is in use.
 3. The method of claim 1, further comprising beginning compression of a third category if the second category is a reserved category.
 4. The method of claim 1, further comprising beginning compression of a third category if a vacancy level in the second category has not been met.
 5. The method of claim 1, wherein the categories being compressed are categories of logical volumes.
 6. The method of claim 1, wherein the categories being compressed are categories of physical data storage cartridges.
 7. A manager in a data storage library, comprising: a database of logical volume categories, each category capable of containing a plurality of sequentially appended orders; a processor; and memory storing program instructions executable in the processor and operable for: retrieving from the database an identification of a first category, the first category being a last-compressed category; retrieving from the database an identification number of a first order of the first category, the first order of the first category being a last compressed order; resuming compression of orders in the first category with an order next following the first order; continuing compression of orders in the first category; if a predetermined amount of time has elapsed, storing in the database the identification of the first category and the identification number of the order of the first category being compressed; and if the predetermined amount of time has not elapsed and compression of the first category is complete, beginning compression of a second category.
 8. The manager of claim 7, wherein the instructions further comprise instructions for beginning compression of a third category if the second category is in use.
 9. The manager of claim 7, wherein the instructions further comprise instructions for beginning compression of a third category if the second category is a reserved category.
 10. The manager of claim 7, wherein the instructions further comprise instructions for beginning compression of a third category if a vacancy level in the second category has not been met.
 11. A data storage library attached to a host device, the library comprising: a plurality of removable data cartridges; a data drive for reading and writing logical volumes from and to a data cartridge loaded therein; an accessor for transporting data cartridges between storage slots and the data drive; a database storing a plurality of volume categories to which the volumes are assigned, each volume being associated with a sequentially designated order entry appended to an end of the category to which the volume is assigned; and a library manager operatively coupled to the data drive, the accessor and an external host device, the library manager comprising: a memory; means for retrieving an identification of a first category from the memory, the first category being a last-compressed category; means for retrieving an identification number of a first order of the first category from the memory, the first order of the first category being a last compressed order; means for compressing orders in the first category beginning with a next following order; means for storing in the memory the identification of the first category and the identification number of the order of the first category being compressed if a predetermined amount of time has elapsed; and means for beginning compression of a second category if the predetermined amount of time has not elapsed and compression of the first category is complete.
 12. The library of claim 11, further comprising means for beginning compression of a third category if the second category is in use.
 13. The library of claim 11, further comprising means for beginning compression of a third category if the second category is a reserved category.
 14. The library of claim 11, further comprising means for beginning compression of a third category if a predetermined vacancy level in the second category has not been met.
 15. A computer program product of a computer readable medium usable with a programmable computer, the computer program product having computer-readable code embodied therein for dynamically compressing categories in a data storage library, the computer-readable code comprising instructions for: retrieving an identification of a first category in a data storage library, the first category being a last-compressed category; retrieving an identification number of a first order of the first category, the first order of the first category being a last compressed order; resuming compression of orders in the first category with an order next following the first order; continuing compression of orders in the first category; if a predetermined amount of time has elapsed, storing the identification of the first category and the identification number of the order of the first category being compressed; and if the predetermined amount of time has not elapsed and compression of the first category is complete, beginning compression of a second category.
 16. The program product of claim 15, wherein the instructions further comprise instructions for beginning compression of a third category if the second category is in use.
 17. The program product of claim 15, wherein the instructions further comprise instructions for beginning compression of a third category if the second category is a reserved category.
 18. The program product of claim 15, wherein the instructions further comprise instructions for beginning compression of a third category if a vacancy level in the second category has not been met.
 19. The program product of claim 15, wherein the categories being compressed are categories of logical volumes.
 20. The program product of claim 15, wherein the categories being compressed are categories of physical data storage cartridges.
 21. A method for deploying computing infrastructure, comprising integrating computer readable code into a computing system, wherein the code, in combination with the computing system, is capable of performing the following: retrieving an identification of a first category in a data storage library, the first category being a last-compressed category; retrieving an identification number of a first order of the first category, the first order of the first category being a last compressed order; resuming compression of orders in the first category with an order next following the first order; continuing compression of orders in the first category; if a predetermined amount of time has elapsed, storing the identification of the first category and the identification number of the order of the first category being compressed; and if the predetermined amount of time has not elapsed and compression of the first category is complete, beginning compression of a second category.
 22. The method of claim 21, further comprising beginning compression of a third category if the second category is in use.
 23. The method of claim 21, further comprising beginning compression of a third category if the second category is a reserved category.
 24. The method of claim 21, further comprising beginning compression of a third category if a vacancy level in the second category has not been met.
 25. The method of claim 21, wherein the categories being compressed are categories of logical volumes.
 26. The method of claim 21, wherein the categories being compressed are categories of physical data storage cartridges. 